Positive, Negative, or Neutral: Learning an Expanded Opinion Lexicon from Emoticon-Annotated Tweets
Abstract
We present a supervised framework for expanding an opinion lexicon for tweets. The lexicon contains part-of-speech (POS) disambiguated entries with a three-dimensional probability distribution for positive, negative, and neutral polarities. To obtain this distribution using machine learning, we propose word-level attributes based on POS tags and information calculated from streams of emoticon-annotated tweets. Our experimental results show that our method outperforms the three-dimensional word-level polarity classification performance obtained by semantic orientation, a state-of-the-art measure for establishing word-level sentiment.
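For orientation, the semantic orientation baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming the common PMI-based definition, SO(w) = PMI(w, positive) - PMI(w, negative), with tweet polarity taken from emoticons. The abstract does not give the formula, so the emoticon sets, whitespace tokenization, add-one smoothing, and the function names `label_tweet` and `semantic_orientation` are all assumptions made for this example, and the sketch is not the supervised method the paper proposes.

```python
import math
from collections import Counter

# Hypothetical emoticon sets used as noisy polarity labels (assumption).
POSITIVE_EMOTICONS = {":)", ":-)", "=)"}
NEGATIVE_EMOTICONS = {":(", ":-(", "=("}


def label_tweet(tokens):
    """Return 'pos' or 'neg' from emoticons, or None if absent or mixed."""
    has_pos = any(t in POSITIVE_EMOTICONS for t in tokens)
    has_neg = any(t in NEGATIVE_EMOTICONS for t in tokens)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def semantic_orientation(tweets, smoothing=1.0):
    """Score each word as PMI(word, pos) - PMI(word, neg).

    Counts are per tweet (a word is counted once per tweet), and
    `smoothing` is added to each word count to avoid log(0).
    """
    word_counts = {"pos": Counter(), "neg": Counter()}
    tweet_counts = Counter()
    for text in tweets:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        label = label_tweet(tokens)
        if label is None:
            continue
        tweet_counts[label] += 1
        words = set(tokens) - POSITIVE_EMOTICONS - NEGATIVE_EMOTICONS
        word_counts[label].update(words)

    if tweet_counts["pos"] == 0 or tweet_counts["neg"] == 0:
        raise ValueError("need at least one positive and one negative tweet")

    scores = {}
    for word in set(word_counts["pos"]) | set(word_counts["neg"]):
        pos = word_counts["pos"][word] + smoothing
        neg = word_counts["neg"][word] + smoothing
        # The difference of the two PMI terms simplifies to this log ratio.
        scores[word] = math.log2(
            (pos * tweet_counts["neg"]) / (neg * tweet_counts["pos"])
        )
    return scores


if __name__ == "__main__":
    sample = ["great day :)", "love it :)", "awful day :(", "hate it :("]
    for word, score in sorted(semantic_orientation(sample).items()):
        print(f"{word}: {score:+.2f}")
```

A score of this kind gives a single signed value per word, so it has no direct notion of neutrality; a neutral class has to be imposed with a threshold around zero, whereas the lexicon described in the abstract assigns each entry an explicit distribution over positive, negative, and neutral.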
Related resources
SeNTU: Sentiment Analysis of Tweets by Combining a Rule-based Classifier with Supervised Learning
We describe a Twitter sentiment analysis system developed by combining a rule-based classifier with supervised learning. We submitted our results for the message-level subtask in SemEval 2015 Task 10, and achieved an F1-score of 57.06%. The rule-based classifier is based on rules that are dependent on the occurrences of emoticons and opinion words in tweets. Whereas, the Support Vector Machine (...
Annotate-Sample-Average (ASA): A New Distant Supervision Approach for Twitter Sentiment Analysis
The classification of tweets into polarity classes is a popular task in sentiment analysis. State-of-the-art solutions to this problem are based on supervised machine learning models trained from manually annotated examples. A drawback of these approaches is the high cost involved in data annotation. Two freely available resources that can be exploited to solve the problem are: 1) large amounts...
Building a Twitter opinion lexicon from automatically-annotated tweets
Opinion lexicons, which are lists of terms labelled by sentiment, are widely used resources to support automatic sentiment analysis of textual passages. However, existing resources of this type exhibit some limitations when applied to social media messages such as tweets (posts in Twitter), because they are unable to capture the diversity of informal expressions commonly found in this type of m...
Sentiment Analysis on Twitter through Topic-Based Lexicon Expansion
Supervised learning approaches are domain-dependent and it is costly to obtain labeled training data from different domains. Lexicon-based approaches enjoy stable performance across domains, but often cannot capture domain-dependent features. It is also hard for lexicon-based classifiers to identify the polarities of abbreviations and misspellings, which are common in short informal social text b...
Sentiment of Emojis
There is a new generation of emoticons, called emojis, that is increasingly being used in mobile communications and social media. In the past two years, over ten billion emojis were used on Twitter. Emojis are Unicode graphic symbols, used as a shorthand to express concepts and ideas. In contrast to the small number of well-known emoticons that carry clear emotional contents, there are hundreds...